Techniques for Use of Vendor Defined Messages to Execute a Command to Access a Storage Device

ABSTRACT

Examples are disclosed for use of vendor defined messages to execute a command to access a storage device maintained at a server. In some examples, a network input/output device coupled to the server may receive the command from a client remote to the server for the client to access the storage device. For these examples, elements or components of the network input/output device may be capable of forwarding the command either directly to a Non-Volatile Memory Express (NVMe) controller that controls the storage device or to a manageability module coupled between the network input/output device and the NVMe controller. Vendor specific information may be forwarded with the command and used by either the NVMe controller or the manageability module to facilitate execution of the command. Other examples are described and claimed.

RELATED CASES

This application is a continuation of, claims the benefit of and priority to, previously filed U.S. patent application Ser. No. 13/743,112 filed Jan. 16, 2013 entitled “TECHNIQUES FOR USE OF VENDOR DEFINED MESSAGES TO EXECUTE A COMMAND TO ACCESS A STORAGE DEVICE” which claims priority to U.S. Provisional Patent Application No. 61/587,541, filed on Jan. 17, 2012; both of which are incorporated herein by reference.

BACKGROUND

In an example conventional computing arrangement, a client and a server include respective network interface controllers (NICs) or network (NW) input/output (I/O) devices that are capable of communicating with each other using a Remote Direct Memory Access (RDMA) protocol. The server includes a host processor that executes the server's operating system and associated drivers. The server may also include a storage controller that manages access to storage maintained at or by the server. The client's NW I/O device issues requests to the server's NW I/O device to write data to and read data from the storage maintained by the server. The server's operating system, associated drivers, and host processor process the requests received by the server's NW I/O device, and issue corresponding requests to the storage controller. The storage controller receives and executes these corresponding requests. After executing the corresponding requests, the storage controller issues request completion information (and associated data if data has been read from the storage) to the server's operating system and associated drivers. From this, the server's operating system, associated drivers, and host processor generate corresponding request completion information and associated data, and issue the corresponding request completion information and associated data to the server's NW I/O device. The server's NW I/O device then issues the corresponding request completion information and associated data to the client's NW I/O device.

Thus, in the foregoing conventional arrangement, the server's operating system, associated drivers, and host processor process requests received by the server's NW I/O device, and the completion information and data from the storage. This may consume substantial amounts of operating system and host processor processing bandwidth. It may also increase the amount of energy consumed and heat dissipated by the host processor. Furthermore, it may increase the latency involved in processing the requests issued by the client's NW I/O device. It is with respect to these and other challenges that the examples described herein are needed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a first example system.

FIG. 2 illustrates an example completion queue element.

FIG. 3 illustrates a second example system.

FIG. 4 illustrates a third example system.

FIG. 5 illustrates example vendor defined message (VDM) formats.

FIG. 6 illustrates an example communication flow.

FIG. 7 illustrates an example block diagram for a first apparatus.

FIG. 8 illustrates an example of a first logic flow.

FIG. 9 illustrates an example of a first storage medium.

FIG. 10 illustrates an example network input/output device.

FIG. 11 illustrates an example block diagram for a second apparatus.

FIG. 12 illustrates an example of a second logic flow.

FIG. 13 illustrates an example of a second storage medium.

FIG. 14 illustrates an example Non-Volatile Memory Express (NVMe) controller.

DETAILED DESCRIPTION

As contemplated in the present disclosure, substantial amounts of operating system and host processor processing bandwidth may be consumed in a conventional arrangement between a client and a server when the client attempts to access storage maintained by the server. Recently, servers are including both NW I/O devices and storage controllers having enhanced capabilities that try to minimize operating system and host processor involvement. For example, hardware elements such as command submission and command completion queues may be utilized by a server's NW I/O device and storage controllers to enable a remote client to access storage via a process known as remote direct memory access (RDMA).

Storage controllers are also being designed to operate in compliance with relatively new interconnect communication protocols that may work well with RDMA. Further, these storage controllers may control access to hard disk drives (HDDs) or solid state drives (SSDs). The SSDs may include, but are not limited to, various types of non-volatile memory such as 3-dimensional cross-point memory, flash memory, ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, polymer memory, nanowire, ferroelectric transistor random access memory (FeTRAM or FeRAM), nanowire or electrically erasable programmable read-only memory (EEPROM). In some examples, access to HDDs or SSDs may include use of interconnect communication protocols described in industry standards or specifications (including progenies or variants) such as the Peripheral Component Interconnect (PCI) Express Base Specification, revision 3.0, published in November 2010 (“PCI Express” or “PCIe”) and/or the Non-Volatile Memory Express (NVMe) Specification, revision 1.1, published in October 2012.

Storage controllers that operate in compliance with the NVMe Specification (“NVMe controllers”) may be capable of minimizing operating system and host processor involvement when allowing a remote client to access storage such as an SSD or an HDD. These types of NVMe controllers may not have built-in security checks to control access to the SSD or HDD by the client. In some deployments, intimate knowledge of the design details of the storage controller may be needed by manufacturers of NW I/O devices in order to set up and then maintain communications with little to no operating system and host processor involvement. However, this may lead to some inflexibility to interchange devices from a host computing platform. Also, operators may be limited to utilizing NW I/O devices and NVMe controllers that were made by the same manufacturer that has the intimate knowledge needed for these types of deployments. Since NW I/O devices and NVMe controllers may be separately made by disparate types of manufacturers (e.g., ones focused on network communications and others focused on storage communications), the number of manufacturers making both types of devices may be limited.

Rather than require such detail of design, both the PCIe and the NVMe Specification allow for the use of vendor defined messages for communications between devices operating in compliance with either of these specifications. The vendor defined messages may be used to generate or create a type of generic interface for communications between NW I/O devices and NVMe controllers to pass commands and completions between these devices. It is with respect to these and other challenges that the examples described herein are needed.

In some examples, techniques associated with use of vendor defined messages to execute a command to access a storage device controlled by an NVMe controller maintained at a server may be implemented. For these examples, circuitry for a NW I/O device coupled to the server may be capable of supporting one or more components associated with receiving a command for a client remote to the server to access the storage device. The one or more components may also be capable of including a first vendor defined message with the command to cause the NVMe controller to execute the command. The one or more components may then forward the command with the first vendor defined message to the NVMe controller. A command completion may be received from the NVMe controller having a second vendor defined message. The second vendor defined message may be used by the one or more components to indicate a status of completion of the command to the client that originated the command.

FIG. 1 illustrates a first example system. As shown in FIG. 1, the first example system includes a system 100 having a client 10 that is communicatively coupled, via network 50, to server 20. According to some examples, the terms “host computer,” “host,” “server,” “client,” “network node,” and “node” may be used interchangeably, and may mean, for example, without limitation, one or more end stations, mobile internet devices, smart phones, media devices, input/output (I/O) devices, tablet computers, appliances, intermediate stations, network interfaces, clients, servers, and/or portions thereof. Although client 10, server 20, and network 50 will be referred to in the singular, it should be understood that each such respective component may comprise a plurality of such respective components without departing from these examples. According to some examples, a “network” may be or comprise any mechanism, instrumentality, modality, and/or portion thereof that permits, facilitates, and/or allows, at least in part, two or more entities to be communicatively coupled together. Also in some examples, a first entity may be “communicatively coupled” to a second entity if the first entity is capable of transmitting to and/or receiving from the second entity one or more commands and/or data. Also, data and information may be used interchangeably, and may be or comprise one or more commands (for example one or more program instructions), and/or one or more such commands may be or comprise data and/or information. Also for these examples, an “instruction” may include data and/or one or more commands.

Client 10 may include a remote direct memory access (RDMA)-enabled network interface controller (RNIC), herein referred to as network (NW) I/O device 106, and/or one or more (and in the example shown in FIG. 1, a plurality of) buffers 13.

As shown in FIG. 1, server 20 may include one or more integrated circuit (IC) chips 180, memory 21, and/or storage 150. One or more chips 180 may have circuitry 118 that may include a NW I/O device 108, a manageability module 109 or an NVMe controller 112. Although not shown in FIG. 1, in some examples NW I/O device 108 and/or NVMe controller 112 may be separately attachable devices that couple to server 20 and include circuitry as described further below.

Also as shown in FIG. 1, the one or more chips 180 may be incorporated within one or more multi-core host processors (HP) and/or central processing units (CPU) 12. Although not shown in the Figures, server 20 also may comprise one or more chipsets or devices (to include, but not limited to, memory or input/output controller circuitry). NW I/O device 108, NVMe controller 112, and/or HP/CPU 12 may be capable of communicating with each other. Additionally, NW I/O device 108, NVMe controller 112, manageability module 109 and/or HP/CPU 12 may be capable of accessing and/or communicating with one or more other components of server 20 (such as memory 21 and/or storage 150), via one or more such chipsets. In some examples, client 10 and/or NW I/O device 106 may be remote (e.g., geographically remote), at least in part, from server 20 and/or NW I/O device 108.

According to some examples, “circuitry” may comprise, for example, singly or in any combination, analog circuitry, digital circuitry, hardwired circuitry, programmable circuitry, co-processor circuitry, state machine circuitry, and/or memory that may comprise program instructions that may be executed by programmable circuitry. Also, in some examples, a processor, HP, CPU, processor core (PC), core, and controller each may comprise respective circuitry capable of performing, at least in part, one or more arithmetic and/or logical operations, and/or of executing, at least in part, one or more instructions. An integrated circuit chip may include one or more microelectronic devices, substrates, and/or dies. Although not shown in FIG. 1, server 20 may have a graphical user interface system that may include, e.g., a respective keyboard, pointing device, and display system that may permit a human user to input commands to, and monitor the operation of, server 20 and/or system 100. Also, memory may comprise one or more of the following types of memories: semiconductor firmware memory, programmable memory, non-volatile memory, read only memory, electrically programmable memory, random access memory, flash memory, magnetic disk memory, optical disk memory, and/or other or later-developed computer-readable and/or writable memory.

In some examples, storage 150 may include mass storage 156. For these examples, storage 150 may include one or more devices into, and/or from which, data may be stored and/or retrieved, respectively. Also, for these examples, mass storage may include storage capable of non-volatile storage of data. For example, mass storage 156 may include, without limitation, one or more non-volatile electro-mechanical, magnetic, optical, and/or semiconductor storage devices. These devices may include hard disk drives (HDDs) or solid state drives (SSDs). The SSDs may have non-volatile types of memory such as 3-dimensional cross-point memory, flash memory, ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, polymer memory, nanowire, ferroelectric transistor random access memory (FeTRAM or FeRAM), nanowire or electrically erasable programmable read-only memory (EEPROM).

According to some examples, manageability module 109, NVMe controller 112, storage 150 or mass storage 156 may be capable of operating in compliance with the PCIe Specification and/or the NVMe Specification.

One or more machine-readable program instructions may be stored, at least in part, in memory 21. In operation of server 20, these machine-readable instructions may be accessed and executed by one or more host processors 12, NW I/O device 108, and/or NVMe controller 112. When executed by one or more HP 12, these one or more machine-readable instructions may result in one or more operating system environments (OSE) 32 being executed at least in part by one or more HP 12, and becoming resident at least in part in memory 21. Also when these machine-readable instructions are executed by NW I/O device 108 and/or NVMe controller 112, these one or more instructions may result in one or more command interfaces 110 of NVMe controller 112, one or more doorbells 192, one or more pointers 202, one or more agents 194, one or more completion queues 124, and/or one or more submission queues 126 being established and/or executed by NW I/O device 108 and/or NVMe controller 112, and/or becoming resident in memory 21.

According to some examples, one or more OSE 32 may include one or more operating systems (OS) 31 and/or one or more NW I/O device and/or NVMe controller drivers 33. These one or more drivers 33 may be mutually distinct from one or more OS 31, at least in part. Alternatively or additionally, without departing from these examples, one or more respective portions of one or more OS 31 and/or drivers 33 may not be mutually distinct, at least in part, from each other and/or may be included, at least in part, in each other. Likewise, without departing from these examples, circuitry 118, NW I/O device 108, manageability module 109 and/or NVMe controller 112 may be distinct from, or alternatively, may be included in the one or more not shown chipsets and/or HP 12. Also without departing from these examples, one or more portions of memory 21 may be included in or maintained at NW I/O device 108, manageability module 109, NVMe controller 112, circuitry 118, HP 12, and/or IC 180.

In some examples, a portion or subset of an entity may include all or less than all of the entity. Also, for these examples, a process, thread, daemon, program, driver, operating system, application, kernel, and/or virtual machine monitor each may (1) include, at least in part, and/or (2) result, at least in part, in and/or from, execution of one or more operations and/or program instructions.

According to some examples, a command interface may facilitate, permit, and/or implement, at least in part, exchange, transmission, and/or receipt of data and/or one or more commands. For these examples, a queue, buffer, and/or doorbell may be one or more locations (e.g., specified and/or indicated, at least in part, by one or more addresses) in memory in which data and/or one or more commands may be stored, at least temporarily. Also, a queue element may include data and/or one or more commands to be stored and/or stored in one or more queues, such as, for example, one or more descriptors and/or one or more commands. Additionally, for these examples, a pointer may indicate, address, and/or specify, at least in part, one or more locations and/or one or more items in memory.

In some examples, NW I/O device 106 and NW I/O device 108 may exchange data and/or commands via network 50 in accordance with one or more protocols that may comply and/or be compatible with an RDMA protocol such as Internet Wide Area RDMA protocol (iWARP), Infiniband (IB) protocol, Ethernet protocol, Transmission Control Protocol/Internet Protocol (TCP/IP) protocol and/or RDMA over Converged Ethernet (RoCE) protocol. For example, the iWARP protocol may comply and/or be compatible with Recio et al., “An RDMA Protocol Specification,” Internet Draft Specification, Internet Engineering Task Force (IETF), 21 Oct. 2002. Also for example, the Ethernet protocol may comply and/or be compatible with Institute of Electrical and Electronics Engineers, Inc. (IEEE) Std. 802.3-2008, Dec. 26, 2008. Additionally, for example, the TCP/IP protocol may comply and/or be compatible with the protocols described in Internet Engineering Task Force (IETF) Request For Comments (RFC) 791 and 793, published September 1981. Also, the IB protocol may comply and/or be compatible with Infiniband Architecture Specification, Vol. 2, Rel. 1.3, published November 2012. Additionally, for example, the RoCE protocol may comply and/or be compatible with Supplement to Infiniband Architecture Specification, Vol. 1, Rel. 1.2.1, Annex A16: “RDMA over Converged Ethernet (RoCE)”, published April 2010. Many different, additional, and/or other protocols may be used for such data and/or command exchange without departing from these examples (e.g., earlier and/or later-developed versions of the aforesaid, related, and/or other protocols).

According to some examples, circuitry 118 may permit and/or facilitate, at least in part, NW I/O device 106's access, via NW I/O device 108, of one or more command interfaces 110. For example, circuitry 118 may permit and/or facilitate, at least in part, NW I/O device 106 being able to so access one or more command interfaces 110 in a manner that is independent of OSE 32 in server 20. This accessing may include, for example, the writing of at least one queue element (e.g., one or more queue elements (QE) 116) to one or more submission queues 114 in one or more command interfaces 110. This may cause NW I/O device 108 to forward commands to NVMe controller 112 to perform, at least in part, one or more operations involving storage 150 and/or mass storage 156 associated with NVMe controller 112. NVMe controller 112 may perform these one or more operations in response, at least in part, to the one or more queue elements 116 (e.g., after and in response, at least in part, to the one or more queue elements 116 being written into one or more submission queues 114 and then forwarded by NW I/O device 108). These one or more operations involving storage 150 and/or mass storage 156 may comprise one or more write operations and/or one or more read operations involving, at least in part, storage 150 and/or mass storage 156. For these examples, client 10 thus may be able to access storage 150 and/or mass storage 156 via the one or more read operations and/or one or more write operations executed by NVMe controller 112.

By way of example, in operation of system 100, client 10 and/or NW I/O device 106 may authenticate client 10 and/or NW I/O device 106 to server 20 and/or logic and/or features at NW I/O device 108. This may result in client 10 and/or NW I/O device 106 being granted permission to access, at least in part, devices maintained at or controlled by elements of server 20 (e.g., via NW I/O device 108). Contemporaneously, after, or prior to this, at least in part, NW I/O device 108, NVMe controller 112, one or more agents 194, and/or OSE 32 may generate, establish, and/or maintain, at least in part, in memory 21, one or more interfaces 110 and/or one or more indicators 181 that may indicate, at least in part, where (e.g., one or more locations) in memory 21 one or more interfaces 110 and/or the components thereof may be located. For example, one or more indicators 181 may indicate, at least in part, one or more locations in memory 21 where one or more submission queues 114, one or more completion queues 120, one or more doorbells 170, and/or one or more buffers 130A . . . 130N may be located. NW I/O device 108 may provide, via network 50, one or more indicators 181 to NW I/O device 106. Thereafter, NW I/O device 106 may use one or more of the one or more indicators 181 to access one or more command interfaces 110 and/or one or more components of the one or more command interfaces 110. One or more indicators 181 may be or comprise, at least in part, one or more handles (e.g., assigned to transaction contexts) for one or more regions in memory 21, such as, in this embodiment, one or more service tags (STags) or transaction tags (TTags) that may comply and/or may be compatible with an RDMA (e.g., iWARP, IB, RoCE) protocol. In some examples, the one or more regions in memory 21 may be included in one or more bounce buffers maintained to facilitate remote access of storage 150 or mass storage 156 by client 10.
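As a rough illustration of how indicators 181 might be used as handles for regions in memory 21, the following Python sketch models memory as a byte array and each indicator as a record of a region's offset and length. The names `Indicator` and `rdma_write`, the tag numbers, and the offsets are hypothetical and are not taken from any RDMA specification or from the examples above; the sketch only shows a write that is resolved through a handle and checked against the bounds of the region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    """Hypothetical handle (loosely analogous to an STag) for a memory region."""
    tag: int
    offset: int
    length: int


def rdma_write(memory: bytearray, table: dict, tag: int, data: bytes, at: int = 0) -> None:
    """Write data into the region named by tag; reject unknown tags and overruns."""
    indicator = table.get(tag)
    if indicator is None:
        raise PermissionError(f"unknown indicator {tag}")
    if at < 0 or at + len(data) > indicator.length:
        raise ValueError("write exceeds region bounds")
    start = indicator.offset + at
    memory[start:start + len(data)] = data


# Assumed layout: a submission queue region, a doorbell region, and a data buffer.
memory_21 = bytearray(64)
indicators = {
    1: Indicator(tag=1, offset=0, length=16),
    2: Indicator(tag=2, offset=16, length=4),
    3: Indicator(tag=3, offset=32, length=32),
}

# A remote writer that holds tag 3 can place data in the buffer region.
rdma_write(memory_21, indicators, 3, b"data199")
```

In this sketch a writer that presents a tag it was never given, or that tries to write past the end of a region, is refused before memory is touched.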

After receiving one or more indicators 181, client 10 and/or NW I/O device 106 may issue one or more commands 105 to server 20, via network 50 and NW I/O device 108, to NVMe controller 112 in a manner that by-passes and/or is independent of the involvement of OSE 32. The one or more commands 105 may command NVMe controller 112 to perform one or more operations involving storage 150 and/or mass storage 156.

According to some examples, one or more commands 105 may comply and/or be compatible with an RDMA (e.g., iWARP, IB, RoCE) protocol. One or more commands 105 may include and/or specify, at least in part, one or more queue elements 116 that may embody and/or indicate, at least in part, the one or more operations involving storage 150 and/or mass storage 156 that are being commanded. Although not shown in FIG. 1, one or more commands 105 may comprise, specify, and/or indicate, at least in part, one or more of the indicators 181 that may indicate one or more locations in one or more submission queues 114 as one or more intended destinations of one or more queue elements 116.

In some examples, one or more queue elements 116 may result in NW I/O device 108 forwarding a command to have NVMe controller 112 perform or execute one or more write operations involving storage 150 and/or mass storage 156. Therefore, one or more commands 105 also may include and/or specify, at least in part, data 199 to be written, as a result of NW I/O device 108 forwarding one or more queue elements 116 to NVMe controller 112, to storage 150 and/or mass storage 156. One or more commands 105 may include, specify, and/or indicate, at least in part, one or more of the indicators 181 that may indicate one or more locations of one or more buffers (e.g., buffer(s) 13) to which data 199 is to be written (at least temporarily) to a client 10.

In response, at least in part, to receipt of one or more commands 105, NW I/O device 108 may directly write (e.g., in accordance with RDMA (e.g., iWARP, IB, RoCE) protocol and/or in a manner that by-passes and/or is independent of OSE 32), in the manner commanded by one or more commands 105, one or more queue elements 116 and data 199 to one or more submission queues 114 and one or more buffers 130A, respectively. Thus, in effect, by issuing one or more commands 105 to NW I/O device 108, NW I/O device 106 may write one or more queue elements 116 and data 199 to one or more submission queues 114 and one or more buffers 130A, respectively.

One or more commands 105 also may comprise and/or specify one or more values 201 and one or more of the indicators 181 that may indicate one or more locations of one or more doorbells 170 to which one or more values 201 may be written. In response, at least in part, to these one or more values 201 and these one or more of the indicators 181 in one or more commands 105, NW I/O device 108 may directly write (e.g., in accordance with RDMA (e.g., iWARP, IB, RoCE) protocol and/or in a manner that by-passes and/or is independent of OSE 32), in the manner commanded by one or more commands 105, one or more values 201 in doorbell 170. The writing of one or more values 201 in doorbell 170 may ring doorbell 170. Thus, in effect, by issuing one or more commands 105 to NW I/O device 108, NW I/O device 106 may ring doorbell 170.

According to some examples, the ringing of a doorbell that is associated with an entity may comprise and/or involve, at least in part, the writing of one or more values to one or more memory locations (e.g., associated with, comprising, and/or embodying the doorbell) that may result in and/or trigger, at least in part, the entity performing, at least in part, one or more operations and/or actions. In some examples, the doorbells 170 and/or 192 may appear to CPU 12 and/or server 20 as one or more respective memory locations (not shown) in respective memory (not shown) in NVMe controller 112 and/or NW I/O device 108, respectively.
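The doorbell behavior described above can be pictured with a minimal Python sketch in which writing a value to a doorbell triggers a callback in the entity that owns it. The classes `Doorbell` and `Controller`, and the convention that the written value is the number of newly submitted entries, are assumptions made for illustration only and do not reflect the register layout of any actual NVMe controller.

```python
class Doorbell:
    """A location whose write triggers an action in the owning entity."""

    def __init__(self, on_ring):
        self.value = 0
        self._on_ring = on_ring

    def write(self, value: int) -> None:
        # Storing the value is what "rings" the doorbell.
        self.value = value
        self._on_ring(value)


class Controller:
    """Hypothetical entity that leaves a reduced power state when rung."""

    def __init__(self):
        self.reduced_power = True
        self.submission_queue = []
        self.executed = []
        self.doorbell = Doorbell(self._on_doorbell)

    def _on_doorbell(self, new_entries: int) -> None:
        # Return to the fully operational state, then read and execute
        # up to new_entries queue elements in the order they were written.
        self.reduced_power = False
        for _ in range(min(new_entries, len(self.submission_queue))):
            self.executed.append(self.submission_queue.pop(0))


controller = Controller()
controller.submission_queue.append("write data 199")
controller.doorbell.write(1)
```

The writer never calls the controller directly; it only stores a value, and the controller's work follows from that store.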

In response, at least in part, to the ringing of doorbell 170, NVMe controller 112 may return to a fully operational state (e.g., if NVMe controller 112 had previously entered a reduced power state relative to this fully operational state), and may read one or more queue elements 116 that were written into one or more submission queues 114. NVMe controller 112 may then execute, at least in part, the one or more commands that are specified and/or embodied by one or more queue elements 116. This may result in NVMe controller 112 performing, at least in part, the one or more operations (e.g., one or more writes to storage 150 and/or mass storage 156 of data 199 stored in one or more buffers 130A) involving storage 150 and/or mass storage 156.

After completion, at least in part, of these one or more operations involving storage 150 and/or mass storage 156, NVMe controller 112 may generate and write, at least in part, one or more completion queue elements (CQE) 129 to one or more completion queues 124. Also after completion, at least in part, of these one or more operations involving storage 150 and/or mass storage 156, NVMe controller 112 or manageability module 109 may write, at least in part, one or more values to one or more doorbells 192 associated with NW I/O device 108. This may ring one or more doorbells 192. In response, at least in part, to the ringing of one or more doorbells 192, NW I/O device 108 may write (e.g., via one or more RDMA write operations) one or more completion queue elements 190 to one or more completion queues 120 and then forward the one or more completion queue elements 190 to one or more buffers 13 in client 10 (e.g., via one or more responses 197).

After one or more (e.g., several) such write and/or read operations involving storage 150 and/or mass storage 156 have been performed, at least in part, one or more agents 194 may carry out certain management functions. For example, one or more agents 194 may establish, at least in part, one or more submission queue entries/elements (E) 196A . . . 196N in one or more submission queues 126 associated with NW I/O device 108 and/or one or more submission queue entries/elements QE A . . . QE N in table 250 (see FIG. 2). As is discussed below, these elements 196A . . . 196N and/or QE A . . . QE N, when executed, at least in part, by NW I/O device 108, may permit and/or facilitate copying or forwarding, at least in part, of one or more other queue entries (e.g., one or more NVMe controller 112 completion entries 129) to client 10 and/or NW I/O device 106 and/or data read by NVMe controller 112.

These management functions also may include the updating (e.g., appropriately advancing), at least in part, by one or more agents 194 of one or more pointers (e.g., ring pointers PNTR 202) associated with one or more queue pairs (e.g., submission/completion queue pair 114, 120 and/or submission/completion queue pair 126, 124) associated with the NW I/O device 108 and the NVMe controller 112. This may permit new entries to the queue pairs to be stored at locations that will not result in erroneous overwriting of other entries in the queue pairs. Additionally, as part of these management functions, the one or more agents 194 may indicate one or more of the buffers 130A . . . 130N that may be available to be reused.
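To show why advancing ring pointers prevents erroneous overwriting, the following Python sketch implements a fixed-size circular queue with a head pointer (next entry to consume) and a tail pointer (next free slot). The class `RingQueue` and its method names are hypothetical; the sketch is a generic ring buffer offered under that assumption and is not a description of how agents 194 or pointers 202 are actually implemented.

```python
class RingQueue:
    """Fixed-size circular queue with explicit head and tail pointers."""

    def __init__(self, size: int):
        self.slots = [None] * size
        self.head = 0   # index of the next entry to consume
        self.tail = 0   # index of the next free slot
        self.count = 0  # number of entries currently stored

    def push(self, entry) -> bool:
        """Store entry at the tail; refuse rather than overwrite when full."""
        if self.count == len(self.slots):
            return False
        self.slots[self.tail] = entry
        self.tail = (self.tail + 1) % len(self.slots)
        self.count += 1
        return True

    def pop(self):
        """Consume the entry at the head and free its slot for reuse."""
        if self.count == 0:
            return None
        entry = self.slots[self.head]
        self.slots[self.head] = None
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return entry


queue = RingQueue(2)
first_ok = queue.push("QE A")
second_ok = queue.push("QE B")
third_ok = queue.push("QE C")       # queue is full, so this is refused
consumed = queue.pop()              # advancing head frees a slot
retry_ok = queue.push("QE C")       # the freed slot is reused
```

Because a slot becomes writable only after the head pointer has moved past it, a new entry cannot land on one that has not yet been consumed.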

As another example, one or more queue elements 116 may command that NVMe controller 112 perform one or more read operations involving storage 150 and/or mass storage 156. Therefore, one or more commands 105 also may include and/or specify, at least in part, one or more locations (e.g., Namespaces) in storage 150 and/or mass storage 156 from which NVMe controller 112 is to read data 199, as a result of executing one or more queue elements 116.

In response, at least in part, to receipt of one or more commands 105, NW I/O device 108 may directly write (e.g., in accordance with an RDMA (e.g., iWARP, IB, RoCE) protocol and/or in a manner that by-passes and/or is independent of OSE 32), in the manner commanded by one or more commands 105, one or more queue elements 116 to one or more submission queues 114. Thus, in effect, by issuing one or more commands 105 to NW I/O device 108, NW I/O device 106 may write one or more queue elements 116 to one or more submission queues 114 and one or more buffers 130A, respectively.

In this example, one or more commands 105 also may comprise and/or specify one or more values 201 and one or more of the indicators 181 that may indicate one or more locations of one or more doorbells 170 to which one or more values 201 are to be written. In response, at least in part, to these one or more values 201 and these one or more of the indicators 181 in one or more commands 105, NW I/O device 108 may directly write (e.g., in accordance with an RDMA (e.g., iWARP, IB, RoCE) protocol and/or in a manner that by-passes and/or is independent of OSE 32), in the manner commanded by one or more commands 105, one or more values 201 in doorbell 170. The writing of one or more values 201 in doorbell 170 may ring doorbell 170. Thus, in effect, by issuing one or more commands 105 to NW I/O device 108, NW I/O device 106 may ring doorbell 170.

In response, at least in part, to the ringing of doorbell 170, NVMe controller 112 may return to a fully operational state (e.g., if NVMe controller 112 had previously entered a reduced power state relative to this fully operational state), and may read one or more queue elements 116 that were written into one or more submission queues 114. NVMe controller 112 then may execute, at least in part, the one or more commands that are specified and/or embodied by one or more queue elements 116. This may result in NVMe controller 112 performing, at least in part, the one or more operations (e.g., one or more reads of storage 150 and/or mass storage 156 to obtain data 199) involving storage 150 and/or mass storage 156 and storing data 199 in one or more buffers (e.g., one or more buffers 130A).

After completion, at least in part, of these one or more operations involving storage 150 and/or mass storage 156, NVMe controller 112 may generate and write, at least in part, one or more completion queue elements 129 to one or more completion queues 124. Also after completion, at least in part, of these one or more operations involving storage 150 and/or mass storage 156, NVMe controller 112 may write, at least in part, one or more values to one or more doorbells 192 associated with NW I/O device 108. This may ring one or more doorbells 192. In response, at least in part, to the ringing of one or more doorbells 192, NW I/O device 108 may obtain queue elements 129 from the one or more completion queues 124 and forward or write one or more completion queue elements 190 to one or more completion queues 120 to facilitate the transfer of data 199 (e.g., via one or more RDMA write operations with NW I/O device 106) to one or more buffers 13 in client 10 (e.g., via one or more responses 197). Alternatively, manageability module 109 may obtain queue elements 129 from completion queues 124 and forward or write completion queue elements 190 to completion queues 120 to facilitate the transfer of data 199 to buffers 13.

According to some examples, command interface 110 may be asynchronous in that, for example, completion queue elements may not be stored in an order in one or more completion queues 120 that corresponds to (1) the order in which command queue elements are stored in the one or more submission queues 114, (2) the order in which such command queue elements are forwarded for execution and/or completion by the NVMe controller 112, and/or (3) the order in which completion queue elements 190 are stored in one or more completion queues 120 and/or provided to NW I/O device 106 and/or client 10. In operation, NW I/O device 106 and/or client 10 may appropriately reorder, in the case of write commands issued from the client 10 and/or NW I/O device 106, corresponding completion queue elements 190 received from NW I/O device 108. However, in the case of read commands, in this embodiment, in order to permit respective data read from storage 150 and/or mass storage 156 to be appropriately associated with corresponding completion queue elements 190 for transmission to client 10 and/or NW I/O device 106, each completion queue element (e.g., completion queue element 190) resulting from completion indications placed in completion queues 120 by NW I/O device 108 may include the elements illustrated in FIG. 2.

As shown in FIG. 2, a completion queue element (e.g., completion queue element 190) may include one or more command parameters 304, one or more command queue identifiers 306, one or more command queue head position indicators 308, status information 310, one or more queue phase bits (P) 312, and/or one or more command identifiers 302. One or more command parameters 304 may be and/or indicate one or more command specific parameters of the one or more queue elements 116 and/or commands 105 that may correspond to and/or be associated with the one or more completion queue elements 190. One or more command queue identifiers 306 may indicate and/or specify the one or more submission queues 114 to which the one or more queue elements 116 were written. One or more command queue head position indicators 308 may indicate the current position (e.g., in the one or more submission queues 114 identified by one or more command queue identifiers 306) at which the one or more queue elements 116 may be located. Status information 310 may indicate whether the one or more commands 105 and/or one or more queue elements 116 were successfully performed by the NVMe controller 112. One or more phase bits 312 may indicate whether the one or more completion queue elements 190 constitute the most recently added valid entry (e.g., to service) in one or more completion queues 120. One or more command identifiers 302 may indicate, at least in part, and/or be identical to one or more corresponding command identifiers in the corresponding one or more queue elements 116. Command identifiers 302 may permit one or more completion queue elements 190 to be correctly associated with one or more corresponding queue elements 116 and/or with the respective data 199 read from the storage 150 and/or mass storage 156 as a result of the execution of these one or more corresponding queue elements 116.
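As a minimal sketch only, the fields of FIG. 2 may be modeled as a record, with the command identifier used to pair a completion with the queue element it answers. The field types, the use of 0 to mean success, and the names CompletionQueueElement and match_completion are assumptions made for this illustration and do not reflect any particular field widths or encodings.

```python
from dataclasses import dataclass


@dataclass
class CompletionQueueElement:
    command_parameters: int  # command specific parameters 304
    queue_id: int            # command queue identifier 306
    queue_head: int          # command queue head position indicator 308
    status: int              # status information 310 (0 assumed to mean success)
    phase: int               # queue phase bit 312
    command_id: int          # command identifier 302


def match_completion(cqe, pending):
    """Remove and return the pending queue element that this completion answers."""
    return pending.pop(cqe.command_id)
```

Because the lookup keys on the command identifier rather than on arrival order, completions that arrive out of order may still be associated with the correct queue element and, for reads, with the correct data.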

In some examples, one or more command identifiers 302 may be selected so as not to collide with and/or be identical to any other command identifiers that may be currently used by any completion queue elements that have not yet been provided to client 10 and/or NW I/O device 106 by NW I/O device 108. The command identifiers that may be used in system 100 may be pre-calculated and/or pre-generated, and may be used as respective indices INDEX A . . . INDEX N for respective entries ENTRY A . . . ENTRY N in a table 250 that may be stored, at least in part, in memory 21. Each of the entries ENTRY A . . . ENTRY N in the table 250 may store one or more respective pre-calculated and/or pre-generated command queue elements QE A . . . QE N that may be associated with NW I/O device 108. Each respective element QE A . . . QE N may be associated with one or more respective buffers in one or more buffers 130A . . . 130N. Each of the buffers in one or more buffers 130A . . . 130N into which NVMe controller 112 may store data read from storage 150 and/or mass storage 156 also may be associated with one or more respective submission identifiers used in system 100 and/or respective entries ENTRY A . . . ENTRY N.

The command queue elements QE A . . . QE N may be stored and/or maintained in table 250 by client 10 and/or one or more agents 194. If one or more buffers 130A . . . 130N are statically allocated, table 250 may be static, and may correspond in terms of, for example, allocation characteristics to one or more buffers 13 that may be allocated in the client 10.

By way of example, after NVMe controller 112 reads data 199 from storage 150 and/or mass storage 156, NVMe controller 112 may store the data 199 in one or more buffers (e.g., one or more buffers 130A) that may be associated with one or more command identifiers 302, and may send an indication to NW I/O device 108 that an access command has been completed, e.g., by ringing one or more doorbells 192. In response, at least in part, to NVMe controller 112 ringing one or more doorbells 192, NW I/O device 108 may determine, based at least in part upon one or more queue phase bits 312, the one or more most recently added valid completion queue elements in one or more completion queues 120. NW I/O device 108 may use the one or more command identifiers 302 in one or more completion queue elements 190 to index into table 250 to locate the one or more entries (e.g., one or more entries ENTRY A) and one or more command queue elements (e.g., one or more queue elements QE A) in table 250 that may be associated with and/or identified, at least in part, by one or more command identifiers 302. NW I/O device 108 may execute, at least in part, one or more commands that may be associated with and/or embodied by these one or more command queue elements QE A. This may result, at least in part, in NW I/O device 108 reading one or more buffers 130A to obtain data 199, and transmitting data 199 and one or more completion queue elements 190 to NW I/O device 106 and/or client 10 (e.g., via one or more responses 197). As a result, data 199 and/or one or more completion queue elements 190 may be copied into one or more client buffers 13.
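The indexing of table 250 by command identifier may be sketched, under stated assumptions, as follows. The dictionary layout, the string placeholders for queue elements, and the names build_table and handle_completion are hypothetical and serve only to show the lookup; they do not describe how table 250 is actually laid out in memory 21.

```python
def build_table(command_ids, buffer_names):
    """Pre-generate one entry per command identifier (a stand-in for table 250)."""
    return {
        cid: {"queue_element": f"send contents of {buf}", "buffer": buf}
        for cid, buf in zip(command_ids, buffer_names)
    }


def handle_completion(command_id, table, buffers):
    """Use the command identifier as an index, then read the associated buffer."""
    entry = table[command_id]            # ENTRY selected by INDEX
    data = buffers[entry["buffer"]]      # data the controller stored earlier
    return {"command_id": command_id, "data": data}
```

Since the entries are generated ahead of time, the work performed at completion time in this sketch is a single lookup followed by a buffer read.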

Alternatively, in some examples, NW I/O device 108 may comprise, at least in part, a state machine (not shown). This state machine may be independent of and/or separate from, at least in part, one or more submission queues 114 that may be associated with and/or utilized by NW I/O device 108. This state machine may locate one or more command queue elements QE A in table 250 based at least in part upon one or more command identifiers 302, and may copy the one or more queue elements QE A into one or more corresponding submission queue elements 196A in one or more submission queues 126. The state machine then may signal NW I/O device 108 to access and execute, at least in part, one or more submission queue elements 196A in one or more submission queues 126.

Further alternatively, without departing from these examples, prior to completing one or more read operations involving storage 150 and/or mass storage 156, NVMe controller 112 may locate and/or select, at least in part, one or more queue elements QE A in and/or from table 250, based at least in part upon one or more command identifiers 302. NVMe controller 112 then may write one or more completion queue elements 190 into one or more completion queues 120, and may write one or more queue elements QE A into one or more corresponding submission queue elements 196A in one or more submission queues 126. NVMe controller 112 then may ring one or more doorbells 192. This may result in NW I/O device 108 accessing and executing, at least in part, one or more submission queue elements 196A in one or more submission queues 126. This may result, at least in part, in NW I/O device 108 reading one or more buffers 130A to obtain data 199, and transmitting data 199 and one or more completion queue elements 190 to NW I/O device 106 and/or client 10 (e.g., via one or more responses 197). As a result, data 199 and/or one or more completion queue elements 190 may be copied into one or more client buffers 13.

In this alternative, firmware and/or one or more agents 194 executed, at least in part, by NW I/O device 108, NVMe controller 112 or manageability module 109 may maintain per-queue-pair context information to indicate one or more queue pairs used for RDMA transactions. This context information also may include various pointers (e.g., to one or more arrays of submission queue elements 196A . . . 196N to move data from one or more buffers 130A . . . 130N to one or more buffers 13, and/or the head of one or more submission queues 126), one or more locations of one or more doorbells 192 and one or more values to ring the one or more doorbells 192, and/or local copies of head and/or tail pointers to the one or more submission queues 126. Various of these pointers (e.g., the head and tail pointers) may be dynamically updated by firmware executed by NVMe controller 112.

Additionally or alternatively, without departing from these examples, NW I/O device 108, manageability module 109 and/or NVMe controller 112 may be comprised, at least in part, in the chipset (not shown), or in a circuit board or device (not shown). Also additionally or alternatively, without departing from this embodiment, storage 150 and/or mass storage 156 may be comprised, at least in part, internally in server 20 or be external to server 20.

Further, although the foregoing description has been made with reference to NW I/O device 108 being an RNIC, and NVMe controller 112 being an NVMe compliant storage controller, the principles of this embodiment may be applied to circumstances in which protocols other than and/or in addition to RDMA or NVMe may be employed, and/or in which NVMe controller 112 may be involved in executing and/or facilitating operations that do not involve storage 150 (e.g., other and/or additional input/output and/or communication-related operations). Accordingly, without departing from the above mentioned examples, NW I/O device 108 may utilize, and/or communications between client 10 and server 20 may employ, protocols other than and/or in addition to RDMA. Also, without departing from this embodiment, NW I/O device 108, manageability module 109 or NVMe controller 112 may be involved in executing and/or may facilitate execution of such other and/or additional operations that may employ protocols other than PCIe or NVMe protocols. In these additional and/or alternative arrangements, hardware and/or firmware circuitry (not shown) may be comprised in circuitry 118 that may permit, at least in part, writing to doorbells 170 and/or 192 via, e.g., one or more interrupt mechanisms (e.g., one or more message signaled interrupts (MSI/MSI-X) and/or other mechanisms). This embodiment should be viewed broadly as covering all such modifications, variations, and alternatives.

Thus, in some examples, circuitry may be arranged, at least in part, to enable a first NW I/O device in a client to access, via a second NW I/O device in a server that is remote from the client and in a manner that is independent of an operating system environment in the server, at least one command interface of another (e.g., storage, and/or another/additional type of) controller of the server. The NW I/O device in the client and the NW I/O device in the server may be or comprise respective remote direct memory access-enabled network interface controllers (e.g., controllers capable, at least in part, of utilizing and/or communicating via RDMA). The command interface may include at least one (e.g., storage, and/or other/additional type of) controller command queue. Such accessing may include writing at least one queue element to the at least one command queue to command the other controller to perform at least one operation (e.g., involving storage, and/or involving one or more other and/or additional types of operations, such as other and/or additional input/output operations) associated with the other controller (e.g., an NVMe controller). The other controller may perform the at least one operation in response, at least in part, to the at least one queue element. Many alternatives, variations, and modifications are possible. Some of these alternatives may include the use of a manageability module (e.g., manageability module 109) coupled between the NW I/O device and the NVMe controller at the server to facilitate the remote NW I/O device's access to the at least one command interface.

Thus, in some examples, the one or more command interfaces 110 of NVMe controller 112 in server 20 may be directly accessed by the client's NW I/O device 106 via one or more RDMA transactions, in a manner that by-passes, is independent of, and/or does not involve the server's OSE 32 and/or CPU 12. Advantageously, this may permit storage commands, data, and completion messages to be communicated between the client and server much more quickly and efficiently, and with reduced latency. Furthermore, in this embodiment, interactions between NW I/O device 108 and NVMe controller 112 may be carried out entirely or almost entirely by hardware (e.g., utilizing peer-to-peer memory and doorbell writes), and also in a manner that by-passes, is independent of, and/or does not involve the server's OSE 32 and/or CPU 12. Advantageously, this may permit such interactions to be carried out much more quickly and efficiently, and with reduced latency. Additionally, the above features of this embodiment may reduce the server's power consumption, heat dissipation, and the amount of bandwidth consumed by the OSE 32 and CPU 12.

Many other modifications are possible. For example, as stated previously, in this embodiment, client 10 may comprise a plurality of clients. If RDMA is employed for communications between server 20 and the clients 10, in this embodiment, advantageously, the clients 10 may dynamically share buffers 130A . . . 130N, as a common pool of buffers, between or among the clients 10 in carrying out their communications with server 20, NW I/O device 108, and/or NVMe controller 112. In order to permit such buffer sharing, NW I/O device 108 may be capable of manipulating, adjusting, and/or modifying, at least in part, buffer-specifying information that may be indicated, at least in part, in commands 105 provided to the server 20 by the clients 10 in order to allow the buffers 130A . . . 130N and/or other server resources to be shared among the clients 10 without resulting in, for example, contention-related issues.

For example, the one or more indicators 181 and/or STags/TTags indicated by the one or more indicators 181 may include respective information that NW I/O device 108 may associate with one or more buffers and/or buffer pools in the buffers 130A . . . 130N, instead of and/or in addition to one or more memory region handles. In this arrangement, the clients 10 may perform RDMA read operations utilizing such indicators 181 and NW I/O device 108 may perform write operations to the one or more buffers and/or buffer pools indicated by the respective information and/or indicators 181. In carrying out its operations, NW I/O device 108 may appropriately adjust the actual commands and/or command queue elements provided to NVMe controller 112 in order to result in the correct buffers, etc. being written to by NVMe controller 112 when NVMe controller 112 carries out such commands and/or command queue elements.

Alternatively or additionally, without departing from the above examples, NW I/O device 108 may include and/or be associated with a shared receive queue (not shown) to receive, for example, commands 105 from multiple clients 10. NW I/O device 108 may be capable of substituting, at least in part, one or more appropriate server buffer addresses, values, and/or other information into one or more portions (e.g., queue elements 116, values 201, indicators 181, and/or other information) of the received commands 105 to permit sharing of the structures in the one or more command interfaces 110 between or among multiple clients 10, without resulting in contention or other degradation in performance. In this arrangement, the clients may not be provided with and/or utilize one or more STags to the storage controller's command queue and/or doorbell, and writing to these structures may be performed by the server's NW I/O device 108. Advantageously, this may permit multiple clients 10 that may be associated with and/or utilize the shared receive queue to utilize and/or share, at least in part, the same storage controller command queue, doorbell, and/or other structures.
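A minimal sketch of such substitution, under the assumption that each command is a simple record and that free server buffers are tracked in a list, might look as follows. The function name assign_server_buffer and the "buffer" key are hypothetical and are introduced only for this illustration.

```python
def assign_server_buffer(command, free_buffers):
    """Return a copy of `command` with a server-chosen buffer substituted in."""
    if not free_buffers:
        raise RuntimeError("no server buffer available")
    rewritten = dict(command)            # leave the client's command untouched
    rewritten["buffer"] = free_buffers.pop()
    return rewritten
```

Because the server side selects the buffer, two clients posting to the same shared receive queue in this sketch never receive the same buffer at the same time, even though neither client names a server buffer itself.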

For example, in the case of a write operation, one or more indicators 181, one or more values 201, and/or other information in one or more commands 105 may indicate, at least in part, one or more storage controller STags or TTags for the write operation (and related information), and/or one or more RDMA STags or TTags to one or more buffers to which one or more completion queue elements may be written. Based at least in part upon the one or more received commands 105 and/or other information stored in NW I/O device 108, NW I/O device 108 may select one or more buffers in buffers 130A . . . 130N and one or more locations in the submission queue 114 to which to post the data 199 to be written and one or more corresponding command queue elements to be forwarded to submission queue 126 associated with NVMe controller 112. NW I/O device 108 may post the data 199 and the one or more corresponding command queue elements in accordance with such selections, and thereafter, may ring doorbell 170. As posted by NW I/O device 108, the one or more command queue elements may indicate the one or more storage controller STags or TTags supplied in the one or more commands 105, command identifier 302, security context information (e.g., to permit validation of the one or more storage controller STags or TTags), and/or one or more STags/TTags to the one or more buffers to which data 199 has been posted. After NVMe controller 112 has completed, at least in part, the requested one or more write operations and posted one or more completion queue elements (e.g., to completion queue 124), NVMe controller 112 may ring doorbell 192. Based at least in part upon information in table 250, NW I/O device 108 may generate and forward to the one or more clients that provided the received command 105 one or more appropriate responses 197 via forwarding the completion queue elements from completion queue 124 to completion queue 120.

In the case of a read operation, generally analogous information may be provided in command 105 and generally analogous operations may be performed by NW I/O device 108 and/or NVMe controller 112. However, in the case of a read operation, the data 199 read by NVMe controller 112 may be stored by NVMe controller 112 to one or more of the buffers 130A . . . 130N specified by the NW I/O device 108, and may be read by the NW I/O device 108, instead of vice versa (e.g., as may be the case in a write operation). NW I/O device 108 may transmit the read data 199 to the one or more clients that provided the received command 105 in one or more responses 197. In the foregoing arrangement, command 105 may be similar or identical to a command that may be utilized by a client to access storage local to the client, at least from the vantage point of one or more client-executed applications initiating such access. Advantageously, this may permit remote operations and/or RDMA transactions of the types previously described to be substantially transparent to these one or more client-executed applications.

Thus, in some examples, advantageously, it may be possible for multiple clients to share the storage controller's command queue, doorbells, and/or the server's buffers, and/or to write to these structures (via the server's NW I/O device) using an RDMA protocol, without suffering from resource contention issues (and/or other disadvantages) that might otherwise occur. The server's NW I/O device may be capable of modifying, at least in part, information associated with and/or comprised in the clients' commands 105 to facilitate such sharing and/or sharing of RDMA STag/TTag information between or among the clients. Advantageously, this may permit an RDMA protocol to be employed for command communication and/or completion information between the server and multiple clients, with improved scalability, while reducing the memory consumption to implement such features, and without degradation in communication line rate.

FIG. 3 illustrates a second example system. As shown in FIG. 3, the second example includes a system 300. According to some examples, system 300 may include multiple client nodes 310-1 to 310-n (where “n” represents any positive integer greater than 3) and a server 305. For these examples, a NW I/O device 330, an NVMe controller 350 and bounce buffer(s) 360 may be located with and/or maintained at server 305.

In some examples, logic and/or features executed by circuitry for NW I/O device 330 and/or server 305 may allocate resources to clients 310-1 to 310-n to facilitate remote access to a storage device (not shown) controlled by NVMe controller 350. For these examples, separate I/O queue pairs (QPs) 320-1 to 320-n and separate NVMe QPs 340-1 to 340-n may be allocated or assigned to clients 310-1 to 310-n, respectively. Also, at least portions of bounce buffer(s) 360 may be allocated or assigned to clients 310-1 to 310-n. I/O QPs 320-1 to 320-n, NVMe QPs 340-1 to 340-n or bounce buffer(s) 360 may be part of system memory resident at server 305. Alternatively, I/O QPs 320-1 to 320-n may be maintained at or within NW I/O device 330 and NVMe QPs 340-1 to 340-n may be maintained at or with NVMe controller 350.

According to some examples, I/O QPs 320-1 to 320-n may separately include both command submission queues and command completion queues utilized by logic and/or features at NW I/O device 330 to exchange information with clients 310-1 to 310-n regarding commands to access the storage controlled by NVMe controller 350. Also, NVMe QPs 340-1 to 340-n may separately include command submission queues and command completion queues utilized by logic and/or features at NW I/O device 330 and at NVMe controller 350 to facilitate the relay of commands from clients 310-1 to 310-n to NVMe controller 350. For these examples, NVMe QPs 340-1 to 340-n are not directly accessible by clients 310-1 to 310-n. Since NVMe QPs 340-1 to 340-n are not directly accessible to clients 310-1 to 310-n, logic and/or features at NW I/O device 330 may be capable of validating commands received from these clients before they are forwarded or relayed to NVMe QPs 340-1 to 340-n.
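One way to picture validation ahead of relay is the sketch below. The validation rule shown (a known client plus an opcode drawn from the commands named in this description) is an assumption, as are the name relay_command and the use of plain lists for the NVMe submission queues; actual validation criteria are not specified here.

```python
# Opcode names follow the commands named in this description; the validation
# rule itself (known client plus recognized opcode) is an assumption.
ALLOWED_OPCODES = {"read", "write", "flush", "write_uncorrectable", "compare"}


def relay_command(client_id, command, nvme_submission_queues):
    """Forward `command` to the client's NVMe submission queue if it validates."""
    if client_id not in nvme_submission_queues:
        return False  # no NVMe QP was allocated to this client
    if command.get("opcode") not in ALLOWED_OPCODES:
        return False  # reject before anything reaches the NVMe QP
    nvme_submission_queues[client_id].append(command)
    return True
```

In this sketch only the relay function ever appends to an NVMe submission queue, which mirrors the arrangement in which clients cannot reach the NVMe QPs directly.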

Also, according to some examples, vendor defined messages may be exchanged between NW I/O device 330 and NVMe controller 350 to facilitate the forwarding of command submissions and command completions between I/O QPs 320-1 to 320-n and NVMe QPs 340-1 to 340-n. For these examples, both NW I/O device 330 and NVMe controller 350 may be capable of operating in compliance with the PCIe and/or NVMe Specifications. The vendor defined messages may include, for example, flow control information. These types of vendor defined messages may allow for formation of a generic-like interface for the exchange of command submissions and command completions between NW I/O device 330 and NVMe controller 350 without a need for detailed knowledge of how each respective device's QPs are arranged or configured. Vendor defined messages may be exchanged via various reserved portions indicated for NVMe or PCIe compliant messages for commands that include, but are not limited to, read, write, flush, write uncorrectable or compare commands.

In some examples, as described more below, logic and/or features executed by circuitry at NW I/O device 330 may receive a command from a client such as client 310-1. For these examples, the command may be to access storage (not shown) controlled by NVMe controller 350. The logic and/or features at NW I/O device 330 may include a first vendor defined message with the command to cause NVMe controller 350 to execute the command. For example, the first vendor defined message may be based on flow control information exchanged between NW I/O device 330 and NVMe controller 350. Based on the exchanged information, the first vendor defined message may identify one or more buffers from among bounce buffer(s) 360 and a given number of credits representing available buffer capacity that may be consumed or used when the command is to be executed by NVMe controller 350. Following execution of the command, logic and/or features executed by circuitry at NVMe controller 350 may send a command completion with a second vendor defined message. The second vendor defined message may identify the one or more buffers and the total credits available following completion of the command and thus may include updated flow control information.

In some examples, logic and/or features at NW I/O device 330 may use the second vendor defined message to determine a status of the completion or execution of the command and forward that determined status to client 310-1. The logic and/or features at NW I/O device 330 may forward the determined status by placing or writing queue elements in a command completion queue included in I/O QP 320-1 and notifying client 310-1 of the writing as mentioned above for FIGS. 1 and 2.

According to some examples, the status determined via use of the second vendor defined message may be based on whether the total credits available indicate that the credits identified in the first vendor defined message were added back to the total credits. If the credits were added back, the status is a successful completion. If the total credits available do not indicate that the credits were added back, the status is an unsuccessful completion. An unsuccessful completion may prompt client 310-1 to resend the command or to perform or initiate some sort of error recovery operation.
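The credit comparison described above may be sketched as follows, assuming for simplicity a single outstanding command at a time and integer credit counts. The class CreditTracker and its method names are hypothetical; how credits are actually counted and exchanged is not specified in this description.

```python
class CreditTracker:
    """Hypothetical view of flow-control credits kept at the NW I/O device."""

    def __init__(self, total_credits):
        self.available = total_credits

    def issue(self, credits_used):
        # Credits named in the first vendor defined message are deducted
        # when the command is forwarded.
        if credits_used > self.available:
            raise ValueError("not enough credits for this command")
        self.available -= credits_used

    def complete(self, credits_used, reported_available):
        # The second vendor defined message reports the total now available;
        # success means the credits used by the command were added back.
        succeeded = reported_available >= self.available + credits_used
        self.available = reported_available
        return succeeded
```

For instance, with 8 credits in total and a command that uses 3, a completion reporting 8 credits available would be treated as successful, while one reporting 5 would be treated as unsuccessful.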

FIG. 4 illustrates a third example system. As shown in FIG. 4, the third example system includes a system 400. According to some examples, system 400 is similar to system 300 with the exception of a manageability module 470 situated between a NW I/O device 430 and an NVMe controller 450. For these examples, rather than establishing a generic-type communication interface between a NW I/O device and an NVMe controller, the generic-type communication interface is established between the NW I/O device and a manageability module.

According to some examples, manageability module 470 may have adequate details about NVMe QPs 440-1 to 440-n to more effectively write command submissions and retrieve command completions from these QPs compared to NW I/O device 430. For these examples, manageability module 470 and NVMe controller 450 may be integrated on a same host platform for server 405. Meanwhile, NW I/O device 430 may be a detachable device that was not designed in an integrated manner and thus may lack detailed information on the design of NVMe controller 450 and its associated NVMe QPs 440-1 to 440-n.

In some examples, logic and/or features executed by circuitry for NW I/O device 430 may receive a command from a client such as client 410-1. For these examples, the command may be to access storage (not shown) controlled by NVMe controller 450. The logic and/or features at NW I/O device 430 may include a first vendor defined message with the command that may eventually cause NVMe controller 450 to execute the command. However, rather than directly exchanging the first vendor defined message with NVMe controller 450, manageability module 470 utilizes its knowledge of NVMe QP 440-1 to serve as an intermediary between the two devices and translate that knowledge to exchange information included in a vendor defined message such as flow control information with NW I/O device 430.

According to some examples, manageability module 470 may use the first vendor defined message forwarded by NW I/O device 430 with the command received from client 410-1 to write the command to a command submission queue included in NVMe QP 440-1. Manageability module 470 may then receive a command completion message via a command completion queue included in NVMe QP 440-1 and forward the command completion message to NW I/O device 430 with a second vendor defined message. The logic and/or features at NW I/O device 430 may then use the second vendor defined message to determine a status of the executed command. Similar to the process mentioned above for FIG. 3, the second vendor defined message may include updated credit-based information that may be used by the logic and/or features at NW I/O device 430 to determine whether the command was successfully completed and forward the determined status to client 410-1.

Although not shown in FIG. 3 or 4, in some examples, manageability module 470 and NVMe controller 450 may coexist on a same chip. For these examples, manageability module 470 may communicate via vendor defined messages directly with NVMe controller 450, yet communicate with NW I/O device 430 using legacy completion and submission queue pairs.

FIG. 5 illustrates example vendor defined message (VDM) formats 510 and 520. In some examples, VDM formats 510 and 520 may be used to convey first and second vendor defined messages, respectively. For these examples, a first vendor defined message conveyed in VDM format 510 may be included with a command forwarded from a NW I/O device. The first vendor defined message may include flow control information and field 512 may include one or more buffer IDs while field 514 may include credits used when the command is to be eventually executed by an NVMe controller. The second vendor defined message conveyed in VDM format 520 may be included in a command completion forwarded either directly from the NVMe controller that executed the command or from a manageability module coupled between the NVMe controller and the NW I/O device. The second vendor defined message may include updated flow control information and field 522 may include one or more buffer IDs while field 524 may include credits available. The credits available included in field 524 may be used to determine whether the command was successfully completed or executed by the NVMe controller based on whether the credits available indicate that the credits identified in the first vendor defined message were added back to the credits available.
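As an illustration only, a two-field message of the kind shown in FIG. 5 may be packed and unpacked as below. The field widths (16 bits each), the little-endian byte order, the single buffer ID per message, and the names pack_vdm and unpack_vdm are all assumptions; FIG. 5 does not specify sizes or encodings.

```python
import struct

# Assumed layout: one 16-bit buffer ID followed by one 16-bit credit count,
# little-endian. Neither width nor byte order comes from FIG. 5.
VDM_LAYOUT = struct.Struct("<HH")


def pack_vdm(buffer_id, credits):
    """Encode a buffer ID (field 512 or 522) and a credit count (514 or 524)."""
    return VDM_LAYOUT.pack(buffer_id, credits)


def unpack_vdm(raw):
    """Decode bytes produced by pack_vdm back into (buffer_id, credits)."""
    return VDM_LAYOUT.unpack(raw)
```

The same layout serves both messages in this sketch: in the first message the second field carries credits used, and in the second message it carries credits available.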

FIG. 6 illustrates an example communication flow 600. In some examples, as shown in FIG. 6, communication flow 600 depicts example communications between client 310-1 and server 305. For these examples, the communications may be compatible with an RDMA (e.g., iWARP, IB, RoCE) protocol.

In some examples, starting from the top of FIG. 6, the first line “RDMA write (S-Stag)(Data)” may be an RDMA Write message carrying transaction data. The second line “RDMA Send (Command) (C-tag, S-Stag)” may be an RDMA Send message from client 310-1 that may include the command for access to the storage controlled by NVMe controller 350. The third line “RDMA Write (C-Stag) (Data)” may be an RDMA Write message to carry data from a read of the storage and targeting a buffer (e.g., identified by C-Stag) maintained at client 310-1 that was originally indicated in a read request command. The fourth line “RDMA Send SE (Completion)” may be an RDMA Send message indicating that a solicited event (SE) such as a read command was completed by NVMe controller 350.

According to some examples, as shown in FIG. 6, solid lines may be related to all operation codes (Opcodes), dotted lines may be related to read only Opcodes and dashed lines may be related to write only Opcodes that may include write or compare commands.
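The grouping of messages by Opcode type may be sketched as follows. This Python sketch rests on an assumption that is not stated in the description: that the first RDMA write (carrying transaction data) accompanies write only Opcodes, that the RDMA Write targeting the client buffer accompanies read only Opcodes, and that the two RDMA Send messages accompany all Opcodes. The names `OpcodeKind` and `rdma_messages` are hypothetical.

```python
from enum import Enum
from typing import List


class OpcodeKind(Enum):
    READ = "read"
    WRITE = "write"  # per the description, may include write or compare commands


def rdma_messages(kind: OpcodeKind) -> List[str]:
    """Return the assumed sequence of FIG. 6 messages for one command."""
    messages = []
    if kind is OpcodeKind.WRITE:
        # Assumed write only: client pushes transaction data first.
        messages.append("RDMA write (S-Stag)(Data)")
    # Assumed for all Opcodes: the command itself.
    messages.append("RDMA Send (Command) (C-tag, S-Stag)")
    if kind is OpcodeKind.READ:
        # Assumed read only: data from the storage is written to the client buffer.
        messages.append("RDMA Write (C-Stag) (Data)")
    # Assumed for all Opcodes: completion as a solicited event.
    messages.append("RDMA Send SE (Completion)")
    return messages
```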

FIG. 7 illustrates an example block diagram of a first apparatus. As shown in FIG. 7, the first apparatus includes apparatus 700. Although apparatus 700 shown in FIG. 7 has a limited number of elements in a certain topology, it may be appreciated that the apparatus 700 may include more or fewer elements in alternate topologies as desired for a given implementation.

The apparatus 700 may be supported by circuitry 720 maintained at a network I/O device coupled to a server. Circuitry 720 may be arranged to execute one or more software or firmware implemented components 722-a. It is worth noting that “a” and “b” and “c” and similar designators as used herein are intended to be variables representing any positive integer. Thus, for example, if an implementation sets a value for a=3, then a complete set of software or firmware for components 722-a may include components 722-1, 722-2 or 722-3. The examples presented are not limited in this context and the different variables used throughout may represent the same or different integer values.

According to some examples, circuitry 720 may include a processor or processor circuitry. The processor or processor circuitry can be any of various commercially available processors, including without limitation AMD® Athlon®, Duron® and Opteron® processors; ARM® application, embedded and secure processors; IBM® and Motorola® DragonBall® and PowerPC® processors; IBM and Sony® Cell processors; Intel® Atom®, Celeron®, Core (2) Duo®, Core i3, Core i5, Core i7, Itanium®, Pentium®, Xeon®, Xeon Phi® and XScale® processors; and similar processors. According to some examples, circuitry 720 may also be an application specific integrated circuit (ASIC) and at least some components 722-a may be implemented as hardware elements of the ASIC.

According to some examples, apparatus 700 may include a receive component 722-1. Receive component 722-1 may be capable of receiving command(s) 705 via messages in an RDMA compliant (e.g., iWARP, IB, RoCE) protocol. Command(s) 705 may have been sent from remote clients to a server. For these examples, the server may be coupled to a NW I/O device having an apparatus 700. Command(s) 705 may include commands to access storage controlled by an NVMe controller located at or with the server. Receive component 722-1 may be capable of at least temporarily storing protocol information 724-a (e.g., in a data structure such as a lookup table (LUT)) in order to interpret or decode at least portions of command(s) 705. Receive component 722-1 may also be capable of receiving completion(s) 710 that may include indications of completions of commands forwarded to the NW I/O device as well as a vendor defined message that may have been forwarded with the completion(s) 710, e.g., updated flow control information. Receive component 722-1 may also obtain PCIe or NVMe protocol information from protocol information 724-a to interpret or decode completion(s) 710.

In some examples, apparatus 700 may also include an information component 722-2. Information component 722-2 may be capable of including a first vendor defined message with command(s) 705 received by receive component 722-1. The first vendor defined message may be obtained from or based on vendor defined information 726-b that may be stored in a data structure such as a LUT. Vendor defined information 726-b may be based on information exchanged with either a manageability module or the NVMe controller that will eventually execute command(s) 705. That information may include flow control information. Information component 722-2 may also be capable of interpreting a second vendor defined message based on vendor defined information 726-b to determine a status of a completion included in completion(s) 710 received by receive component 722-1 following completion of command(s) 705 by the NVMe controller. The second vendor defined message received with completion(s) 710 may be in a message in the example format of VSI format 520.

In some examples, apparatus 700 may also include a forward component 722-3. Forward component 722-3 may be capable of forwarding command(s) 705 with the first vendor defined message to or towards the NVMe controller. For these examples, the first vendor defined message may be forwarded in a message in the example format of VSI format 510. Forward component 722-3 may also be capable of forwarding status 745 to the client that originally sent command(s) 705. Status 745, for example, may indicate the status of command(s) 705 based on the second vendor defined message received with completion(s) 710. Forward component 722-3 may be capable of at least temporarily storing protocol information 724-a (e.g., in a LUT) in order to encode command(s) 705 in PCIe or NVMe compliant format to be sent to or towards the NVMe controller or to encode at least portions of status 745 in an RDMA compliant (e.g., iWARP, IB, RoCE) protocol to be sent to the client that originated command(s) 705.

Included herein is a set of logic flows representative of example methodologies for performing novel aspects of the disclosed architecture. While, for purposes of simplicity of explanation, the one or more methodologies shown herein are shown and described as a series of acts, those skilled in the art will understand and appreciate that the methodologies are not limited by the order of acts. Some acts may, in accordance therewith, occur in a different order and/or concurrently with other acts from that shown and described herein. For example, those skilled in the art will understand and appreciate that a methodology could alternatively be represented as a series of interrelated states or events, such as in a state diagram. Moreover, not all acts illustrated in a methodology may be required for a novel implementation.

A logic flow may be implemented in software, firmware, and/or hardware. In software and firmware embodiments, a logic flow may be implemented by computer executable instructions stored on at least one non-transitory computer readable medium or machine readable medium, such as an optical, magnetic or semiconductor storage. The embodiments are not limited in this context.

FIG. 8 illustrates an example of a first logic flow. As shown in FIG. 8, the first logic flow includes logic flow 800. Logic flow 800 may be representative of some or all of the operations executed by one or more logic, features, or devices described herein, such as apparatus 700. More particularly, logic flow 800 may be implemented by receive component 722-1, information component 722-2 or forward component 722-3.

According to some examples, logic flow 800 at block 802 may receive a command from a client to access a storage device controlled by an NVMe controller maintained at a server. For example, command(s) 705 may be received by receive component 722-1 included in an apparatus 700 for a NW I/O device coupled to the server.

In some examples, logic flow 800 at block 804 may include a first vendor defined message with the command to cause the NVMe controller to execute the command. For example, information component 722-2 may use information included in vendor defined information 726-b to convey the first vendor defined message in a message in the example format of VSI format 510 that includes flow control information.

According to some examples, logic flow 800 at block 806 may then forward the command with the first vendor defined message to the NVMe controller. For these examples, forward component 722-3 may forward command(s) 705 with the first vendor defined message in a message in the example format of VSI format 510 that includes flow control information. In some examples, an intermediary such as a manageability module may receive command(s) 705 with the first vendor defined message and may use the first vendor defined message to cause the NVMe controller to execute command(s) 705. In other examples, the NVMe controller may directly receive command(s) 705 with the first vendor defined message and, based at least in part on the first vendor defined message, the NVMe controller may execute command(s) 705.

In some examples, logic flow 800 at block 806 may receive a command completion message with a second vendor defined message from the NVMe controller. Also at block 806, logic flow 800 may forward a status of the executed command to the client based, at least in part, on the second vendor defined message. For these examples, completion(s) 710 with the second vendor defined message may be received by receive component 722-1. Also, information component 722-2 may interpret the second vendor defined message to determine a status of completion of command(s) 705 using vendor defined information 726-b. The second vendor defined message received with completion(s) 710 may include updated credit-based flow information that may be used by information component 722-2 to determine whether command(s) 705 were successfully completed. Forward component 722-3 may then forward status 745 to the client that may indicate the determined status.
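The operations of logic flow 800 may be sketched end to end as follows. This is a minimal Python illustration under stated assumptions, not the disclosed implementation: the names `Command`, `Completion`, `Controller` and `NwIoDeviceSketch` are hypothetical, the controller (or manageability module) is modeled as a plain callable, a fixed number of credits per command is assumed, and success is assumed to mean that the credits available reported in the completion returned to their level before the command was forwarded.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Command:
    opcode: str
    buffer_id: int
    credits: int = 0  # set from the first vendor defined message


@dataclass
class Completion:
    buffer_id: int
    credits_available: int  # carried in the second vendor defined message


# Hypothetical stand-in for the NVMe controller or manageability module.
Controller = Callable[[Command], Completion]


class NwIoDeviceSketch:
    def __init__(self, credits_available: int, credits_per_command: int):
        self.credits_available = credits_available
        self.credits_per_command = credits_per_command

    def handle(self, opcode: str, buffer_id: int, controller: Controller) -> str:
        # Receive the command from the client.
        cmd = Command(opcode, buffer_id)
        # Include the first vendor defined message (flow control information).
        cmd.credits = self.credits_per_command
        credits_before = self.credits_available
        self.credits_available -= cmd.credits
        # Forward the command with the first vendor defined message.
        completion = controller(cmd)
        # Interpret the second vendor defined message (updated flow control).
        self.credits_available = completion.credits_available
        succeeded = completion.credits_available >= credits_before
        # Status to be forwarded to the client.
        return "success" if succeeded else "failure"
```

In this sketch the returned string stands in for status 745; an actual device would encode the status in an RDMA compliant message to the client.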

FIG. 9 illustrates an example of a first storage medium. As shown in FIG. 9, the first storage medium includes storage medium 900. Storage medium 900 may comprise an article of manufacture. In some examples, storage medium 900 may include any non-transitory computer readable medium or machine readable medium, such as an optical, magnetic or semiconductor storage. Storage medium 900 may store various types of computer executable instructions, such as instructions to implement logic flow 800. Examples of a computer readable or machine readable storage medium may include any tangible media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth. Examples of computer executable instructions may include any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, object-oriented code, visual code, and the like. The examples are not limited in this context.

FIG. 10 illustrates an example NW I/O device 1000. In some examples, as shown in FIG. 10, NW I/O device 1000 may include a processing component 1040, other platform components 1050 or a communications interface 1060. According to some examples, NW I/O device 1000 may be implemented in a NW I/O device coupled to a server in a system or data center as mentioned above.

According to some examples, processing component 1040 may execute processing operations or logic for apparatus 700 and/or storage medium 900. Processing component 1040 may include various hardware elements, software elements, or a combination of both. Examples of hardware elements may include devices, logic devices, components, processors, microprocessors, circuits, processor circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate array (FPGA), memory units, logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. Examples of software elements may include software components, programs, applications, computer programs, application programs, device drivers, system programs, software development programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an example is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints, as desired for a given example.

In some examples, other platform components 1050 may include common computing elements, such as one or more processors, multi-core processors, co-processors, memory units, chipsets, controllers, peripherals, interfaces, oscillators, timing devices, and so forth. Examples of memory units may include without limitation various types of computer readable and machine readable storage media in the form of one or more higher speed memory units, such as ROM, RAM, DRAM, DDRAM, SDRAM, SRAM, PROM, EPROM, EEPROM, flash memory or any other type of storage media suitable for storing information.

In some examples, communications interface 1060 may include logic and/or features to support a communication interface. For these examples, communications interface 1060 may include one or more communication interfaces that operate according to various communication protocols or standards to communicate over direct or network communication links. Direct communications may occur via use of communication protocols or standards described in one or more industry standards (including progenies and variants) such as those associated with the PCIe specification, the NVMe specification, the RDMA Protocol specification, the IEEE 802-2-2008 specification, RFC 791 or RFC 793.

The components and features of NW I/O device 1000 may be implemented using any combination of discrete circuitry, application specific integrated circuits (ASICs), logic gates and/or single chip architectures. Further, the features of NW I/O device 1000 may be implemented using microcontrollers, programmable logic arrays and/or microprocessors or any combination of the foregoing where suitably appropriate. It is noted that hardware, firmware and/or software elements may be collectively or individually referred to herein as “logic” or “circuit.”

It should be appreciated that the exemplary NW I/O device 1000 shown in the block diagram of FIG. 10 may represent one functionally descriptive example of many potential implementations. Accordingly, division, omission or inclusion of block functions depicted in the accompanying figures does not imply that the hardware components, circuits, software and/or elements for implementing these functions would necessarily be divided, omitted, or included in embodiments.

FIG. 11 illustrates an example block diagram of a second apparatus. As shown in FIG. 11, the second apparatus includes apparatus 1100. Although apparatus 1100 shown in FIG. 11 has a limited number of elements in a certain topology, it may be appreciated that the apparatus 1100 may include more or fewer elements in alternate topologies as desired for a given implementation.

The apparatus 1100 may be supported by circuitry 1120 maintained at an NVMe controller located at or with a server. Circuitry 1120 may be arranged to execute one or more software or firmware implemented components 1122-a. It is worth noting that “a” and “b” and “c” and similar designators as used herein are intended to be variables representing any positive integer. Thus, for example, if an implementation sets a value for a=3, then a complete set of software or firmware for components 1122-a may include components 1122-1, 1122-2 or 1122-3. The examples presented are not limited in this context and the different variables used throughout may represent the same or different integer values.

According to some examples, circuitry 1120 may include a processor or processor circuitry. The processor or processor circuitry can be any of various commercially available processors, including without limitation AMD® Athlon®, Duron® and Opteron® processors; ARM® application, embedded and secure processors; IBM® and Motorola® DragonBall® and PowerPC® processors; IBM and Sony® Cell processors; Intel® Atom®, Celeron®, Core (2) Duo®, Core i3, Core i5, Core i7, Itanium®, Pentium®, Xeon®, Xeon Phi® and XScale® processors; and similar processors. According to some examples, circuitry 1120 may also be an application specific integrated circuit (ASIC) and at least some components 1122-a may be implemented as hardware elements of the ASIC.

According to some examples, apparatus 1100 may include a receive component 1122-1. Receive component 1122-1 may be capable of receiving command(s) 1110 via messages in a PCIe and/or NVMe compliant protocol. Command(s) 1110 may have been originally sent from remote clients to the server and then forwarded by a NW I/O device. Command(s) 1110 may have been forwarded with a first vendor defined message. Command(s) 1110 may include commands to access storage controlled by an NVMe controller having an apparatus 1100. Receive component 1122-1 may be capable of at least temporarily storing protocol information 1124-a (e.g., in a data structure such as a lookup table (LUT)) in order to interpret or decode at least portions of command(s) 1110.

In some examples, apparatus 1100 may also include an execution component 1122-2. Execution component 1122-2 may be capable of executing command(s) 1110 received by receive component 1122-1 based, at least in part, on the first vendor defined message forwarded with command(s) 1110. The first vendor defined message may be obtained from or based on vendor defined information 1126-b that may be stored in a data structure such as a LUT. Vendor defined information 1126-b may be based on information exchanged with the NW I/O device. That information may include flow control information.

In some examples, apparatus 1100 may also include a send component 1122-3. Send component 1122-3 may be capable of sending completion(s) 1130 with the second vendor defined message to the NW I/O device. For these examples, the second vendor defined message may be forwarded in a message in the example format of VSI format 520. Send component 1122-3 may be capable of at least temporarily storing protocol information 1124-a (e.g., in a LUT) in order to encode completion(s) 1130 in PCIe or NVMe compliant format to be sent to the NW I/O device.

FIG. 12 illustrates an example of a second logic flow. As shown in FIG. 12, the second logic flow includes logic flow 1200. Logic flow 1200 may be representative of some or all of the operations executed by one or more logic, features, or devices described herein, such as apparatus 1100. More particularly, logic flow 1200 may be implemented by receive component 1122-1, execution component 1122-2 or send component 1122-3.

According to some examples, logic flow 1200 at block 1202 may receive, at an NVMe controller, a command forwarded by a NW I/O device. Also at block 1202, the command may be for a remote client to access a storage device controlled by the NVMe controller. For example, command(s) 1110 may be received by receive component 1122-1 included in an apparatus 1100 for an NVMe controller maintained at the server.

In some examples, logic flow 1200 at block 1204 may execute the command based, at least in part, on a first vendor defined message included with the command. For example, execution component 1122-2 may use information included in vendor defined information 1126-b to interpret the first vendor defined message that may include flow control information in a message received in VSI format 510.

According to some examples, logic flow 1200 at block 1206 may then send a command completion message with a second vendor defined message to the NW I/O device. For these examples, send component 1122-3 may send completion(s) 1130 with the second vendor defined message in a message in the example format of VSI format 520 that includes updated flow control information. The NW I/O device may use the second vendor defined message to determine a status of the completion of command(s) 1110 and then forward the status to the client that originated command(s) 1110.
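The controller-side operations of logic flow 1200 may be sketched as follows. This Python sketch is illustrative only and rests on assumptions not stated in the description: the names `ForwardedCommand`, `CompletionMessage` and `ControllerSketch` are hypothetical, storage is modeled as an in-memory dictionary keyed by a block address, only "read" and "write" opcodes are handled, and credits are assumed to be added back to the credits available only when the command executes successfully.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ForwardedCommand:
    opcode: str                   # "read" or "write" in this sketch
    lba: int                      # hypothetical block address
    credits: int                  # from the first vendor defined message
    data: Optional[bytes] = None  # payload for a write


@dataclass
class CompletionMessage:
    credits_available: int        # second vendor defined message
    data: Optional[bytes] = None  # data returned by a read


class ControllerSketch:
    def __init__(self, total_credits: int):
        self.credits_available = total_credits
        self.blocks: Dict[int, bytes] = {}

    def handle(self, cmd: ForwardedCommand) -> CompletionMessage:
        # Block 1202: receive the command; reserve the credits it names.
        self.credits_available -= cmd.credits
        data = None
        executed = False
        # Block 1204: execute based on the first vendor defined message.
        if cmd.opcode == "write" and cmd.data is not None:
            self.blocks[cmd.lba] = cmd.data
            executed = True
        elif cmd.opcode == "read" and cmd.lba in self.blocks:
            data = self.blocks[cmd.lba]
            executed = True
        if executed:
            # Assumed: credits are added back only on successful execution.
            self.credits_available += cmd.credits
        # Block 1206: send the completion with updated flow control information.
        return CompletionMessage(self.credits_available, data)
```

Under these assumptions, a NW I/O device that compares the reported credits available with the level it recorded before forwarding the command could tell a successful completion from a failed one.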

FIG. 13 illustrates an example of a second storage medium. As shown in FIG. 13, the second storage medium includes storage medium 1300. Storage medium 1300 may comprise an article of manufacture. In some examples, storage medium 1300 may include any non-transitory computer readable medium or machine readable medium, such as an optical, magnetic or semiconductor storage. Storage medium 1300 may store various types of computer executable instructions, such as instructions to implement logic flow 1200. Examples of a computer readable or machine readable storage medium may include any tangible media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth. Examples of computer executable instructions may include any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, object-oriented code, visual code, and the like. The examples are not limited in this context.

FIG. 14 illustrates an example NVMe controller 1400. In some examples, as shown in FIG. 14, NVMe controller 1400 may include a processing component 1440, other platform components 1450 or a communications interface 1460. According to some examples, NVMe controller 1400 may be implemented in a controller coupled to or maintained at a server in a system or data center as mentioned above.

According to some examples, processing component 1440 may execute processing operations or logic for apparatus 1100 and/or storage medium 1300. Processing component 1440 may include various hardware elements, software elements, or a combination of both. Examples of hardware elements may include devices, logic devices, components, processors, microprocessors, circuits, processor circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate array (FPGA), memory units, logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. Examples of software elements may include software components, programs, applications, computer programs, application programs, device drivers, system programs, software development programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an example is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints, as desired for a given example.

In some examples, other platform components 1450 may include common computing elements, such as one or more processors, multi-core processors, co-processors, memory units, chipsets, controllers, peripherals, interfaces, oscillators, timing devices, and so forth. Examples of memory units may include without limitation various types of computer readable and machine readable storage media in the form of one or more higher speed memory units, such as ROM, RAM, DRAM, DDRAM, SDRAM, SRAM, PROM, EPROM, EEPROM, flash memory or any other type of storage media suitable for storing information.

In some examples, communications interface 1460 may include logic and/or features to support a communication interface. For these examples, communications interface 1460 may include one or more communication interfaces that operate according to various communication protocols or standards to communicate over communication links. Communications may occur via use of communication protocols or standards described in one or more industry standards (including progenies and variants) such as those associated with the PCIe specification or the NVMe specification.

The components and features of NVMe controller 1400 may be implemented using any combination of discrete circuitry, application specific integrated circuits (ASICs), logic gates and/or single chip architectures. Further, the features of NVMe controller 1400 may be implemented using microcontrollers, programmable logic arrays and/or microprocessors or any combination of the foregoing where suitably appropriate. It is noted that hardware, firmware and/or software elements may be collectively or individually referred to herein as “logic” or “circuit.”

It should be appreciated that the exemplary NVMe controller 1400 shown in the block diagram of FIG. 14 may represent one functionally descriptive example of many potential implementations. Accordingly, division, omission or inclusion of block functions depicted in the accompanying figures does not imply that the hardware components, circuits, software and/or elements for implementing these functions would necessarily be divided, omitted, or included in embodiments.

One or more aspects of at least one example may be implemented by representative instructions stored on at least one machine-readable medium which represents various logic within the processor, which when read by a machine, computing device or system causes the machine, computing device or system to fabricate logic to perform the techniques described herein. Such representations, known as “IP cores,” may be stored on a tangible, machine readable medium and supplied to various customers or manufacturing facilities to load into the fabrication machines that actually make the logic or processor.

Various examples may be implemented using hardware elements, software elements, or a combination of both. In some examples, hardware elements may include devices, components, processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate array (FPGA), memory units, logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. In some examples, software elements may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an example is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints, as desired for a given implementation.

Some examples may include an article of manufacture or at least one computer-readable medium. A computer-readable medium may include a non-transitory storage medium to store logic. In some examples, the non-transitory storage medium may include one or more types of computer-readable storage media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth. In some examples, the logic may include various software elements, such as software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, API, instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof.

According to some examples, a computer-readable medium may include a non-transitory storage medium to store or maintain instructions that when executed by a machine, computing device or system, cause the machine, computing device or system to perform methods and/or operations in accordance with the described examples. The instructions may include any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, and the like. The instructions may be implemented according to a predefined computer language, manner or syntax, for instructing a machine, computing device or system to perform a certain function. The instructions may be implemented using any suitable high-level, low-level, object-oriented, visual, compiled and/or interpreted programming language.

Some examples may be described using the expression “in one example” or “an example” along with their derivatives. These terms mean that a particular feature, structure, or characteristic described in connection with the example is included in at least one example. The appearances of the phrase “in one example” in various places in the specification are not necessarily all referring to the same example.

Some examples may be described using the expression “coupled” and “connected” along with their derivatives. These terms are not necessarily intended as synonyms for each other. For example, descriptions using the terms “connected” and/or “coupled” may indicate that two or more elements are in direct physical or electrical contact with each other. The term “coupled,” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.

It is emphasized that the Abstract of the Disclosure is provided to comply with 37 C.F.R. Section 1.72(b), requiring an abstract that will allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single example for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed examples require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed example. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate example. In the appended claims, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein,” respectively. Moreover, the terms “first,” “second,” “third,” and so forth, are used merely as labels, and are not intended to impose numerical requirements on their objects.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

What is claimed is:
 1. An apparatus comprising: circuitry for a networkinput/output device coupled to a server; a receive component forexecution by the circuitry to receive a command for a client remote tothe server to access a storage device controlled by a Non-VolatileMemory Express (NVMe) controller maintained at the server; aninformation component for execution by the circuitry to include a firstvendor defined message with the command to cause the NVMe controller toexecute the command; and a forward component for execution by thecircuitry to forward the command with the first vendor defined messageto the NVMe controller.
 2. The apparatus of claim 1, the network input/output device, the storage device and the NVMe controller arranged to operate in compliance with an industry standard to include PCIe Base Specification, revision 3.0 or NVMe Specification, revision 1.1.
 3. The apparatus of claim 2, the NVMe controller to directly receive the command and execute the command based, at least in part, on the first vendor defined message, the receive component to receive a command completion message from the NVMe controller with a second vendor defined message and the forward component to forward a status of the executed command to the client based, at least in part, on the second vendor defined message.
 4. The apparatus of claim 3, the first vendor defined message comprises flow control information exchanged between the network input/output device and the NVMe controller and the second vendor defined message comprises updated flow control information.
 5. The apparatus of claim 1, the command with the first vendor defined message received by a manageability module coupled between the network input/output device and the NVMe controller, the manageability module to use the first vendor defined message to forward the command to the NVMe controller via a command submission queue maintained by the NVMe controller, the manageability module to receive a command completion message via a command completion queue maintained by the NVMe controller and forward the command completion message to the network input/output device with a second vendor defined message, the receive component to receive the command completion message with the second vendor defined message and the forward component to forward a status of the executed command to the client based, at least in part, on the second vendor defined message.
 6. The apparatus of claim 5, the network input/output device, the manageability module, the storage device and the NVMe controller arranged to operate in compliance with an industry standard to include PCIe Base Specification, revision 3.0 or NVMe Specification, revision 1.1, the first vendor defined message comprises flow control information exchanged between network input/output device and the manageability module and the second vendor defined message comprises updated flow control information.
 7. The apparatus of claim 1, the command received in a packet compatible with a remote direct memory access (RDMA) protocol to include one of Internet Wide Area RDMA protocol (iWARP), Infiniband or RDMA over Converged Ethernet (RoCE).
 8. The apparatus of claim 1, the command includes one of a flush command, a write command, a read command, a write uncorrectable command or a compare command.
 9. The apparatus of claim 1, the storage device to include a hard disk drive (HDD) or a solid state drive (SSD), the SSD having non-volatile memory comprising at least one of 3-dimensional cross-point memory, flash memory, ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, polymer memory, nanowire, ferroelectric transistor random access memory (FeTRAM or FeRAM), nanowire or electrically erasable programmable read-only memory (EEPROM).
 10. A method comprising: receiving, at a network input/output device coupled to a server, a command for a client remote to the server to access a storage device controlled by a Non-Volatile Memory Express (NVMe) controller maintained at the server; forwarding the command to the NVMe controller with a first vendor defined message, the NVMe controller to receive the command and execute the command based, at least in part, on the first vendor defined message; and receiving a command completion message with a second vendor defined message from the NVMe controller and forwarding a status of the executed command to the client based, at least in part, on the second vendor defined message.
 11. The method of claim 10, comprising the network input/output device, the storage device and the NVMe controller arranged to operate in compliance with an industry standard to include PCIe Base Specification, revision 3.0 or NVMe Specification, revision 1.1.
 12. The method of claim 11, the first vendor defined message comprises flow control information exchanged between the network input/output device and the NVMe controller and the second vendor defined message comprises updated flow control information.
 13. The method of claim 10, the command received in a packet compatible with a remote direct memory access (RDMA) protocol to include one of Internet Wide Area RDMA protocol (iWARP), Infiniband or RDMA over Converged Ethernet (RoCE).
 14. The method of claim 10, the command includes one of a flush command, a write command, a read command, a write uncorrectable command or a compare command.
 15. The method of claim 10, the storage device to include a hard disk drive or a solid state drive (SSD), the SSD having non-volatile memory comprising at least one of 3-dimensional cross-point memory, flash memory, ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, polymer memory, nanowire, ferroelectric transistor random access memory (FeTRAM or FeRAM), nanowire or electrically erasable programmable read-only memory (EEPROM).
 16. A method comprising: receiving, at a Non-Volatile Memory Express (NVMe) controller coupled to a server, a command forwarded by a network input/output device coupled to the server, the command for a client remote to the server to access a storage device controlled by the NVMe controller; executing the command based, at least in part, on a first vendor defined message received with the command; and sending a command completion message with a second vendor defined message to the network input/output device.
 17. The method of claim 16, the storage device and the NVMe controller arranged to operate in compliance with an industry standard to include PCIe Base Specification, revision 3.0 or NVMe Specification, revision 1.1.
 18. The method of claim 16, the first vendor defined message comprises flow control information exchanged between the network input/output device and the NVMe controller and the second vendor defined message comprises updated flow control information, the network input/output device to forward a status of the executed command to the client based, at least in part, on the updated flow control information.
 19. The method of claim 16, receiving the command via a manageability module coupled between the NVMe controller and the network input/output device, the manageability module adding the first vendor defined message with the command forwarded from the network input/output device.
 20. The method of claim 19, the network input/output device, the manageability module, the storage device and the NVMe controller arranged to operate in compliance with an industry standard to include PCIe Base Specification, revision 3.0 or NVMe Specification, revision 1.1, the first vendor defined message comprises flow control information exchanged between the network input/output device and the manageability module and the second vendor defined message comprises updated flow control information used by the network input/output device to forward a status of the executed command to the client.
 21. At least one machine readable medium comprising a plurality of instructions that in response to being executed on a network input/output device coupled to a server cause the network input/output device to: receive a command for a client remote to the server to access a storage device controlled by a Non-Volatile Memory Express (NVMe) controller maintained at the server, the command received in a packet compatible with a remote direct memory access (RDMA) protocol to include one of Internet Wide Area RDMA protocol (iWARP), Infiniband or RDMA over Converged Ethernet (RoCE); and forward the command to the NVMe controller with a first vendor defined message in the command, the NVMe controller to receive the command and execute the command based, at least in part, on the first vendor defined message.
 22. The at least one machine readable medium of claim 21, the command with the first vendor defined message received by a manageability module coupled between the network input/output device and the NVMe controller, the manageability module to use the first vendor defined message to forward the command to the NVMe controller via a command submission queue maintained by the NVMe controller, the manageability module to receive a command completion message via a command completion queue maintained by the NVMe controller and forward the command completion message to the network input/output device with a second vendor defined message, the plurality of instructions to also cause the network input/output device to receive the command completion message including the second vendor defined message and forward a status of the executed command to the client based, at least in part, on the second vendor defined message.
 23. The at least one machine readable medium of claim 22, the network input/output device, the manageability module, the storage device and the NVMe controller arranged to operate in compliance with an industry standard to include PCIe Base Specification, revision 3.0 or NVMe Specification, revision 1.1, the first vendor defined message comprises flow control information exchanged between network input/output device and the manageability module and the second vendor defined message comprises updated flow control information.
 24. The at least one machine readable medium of claim 21, the command includes one of a flush command, a write command, a read command, a write uncorrectable command or a compare command.
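
By way of illustration only, and without limiting the claims, the message flow recited above (a network input/output device including a first vendor defined message with a command, an NVMe controller executing the command based at least in part on that message, and a command completion message returned with a second vendor defined message carrying updated flow control information) may be sketched in software as follows. The class names, the `credits` field and the credit-counting policy are hypothetical assumptions made for this sketch; the claims do not specify any message layout or flow control scheme, and the sketch does not model PCIe, RDMA or NVMe queue mechanics.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VendorDefinedMessage:
    # Hypothetical layout: a single flow control credit count.
    credits: int


@dataclass
class Command:
    opcode: str                      # e.g. "flush", "write", "read", "compare"
    lba: int                         # logical block address (assumed field)
    data: Optional[bytes] = None
    vdm: Optional[VendorDefinedMessage] = None


@dataclass
class Completion:
    status: str
    data: Optional[bytes]
    vdm: VendorDefinedMessage        # second vendor defined message


class NvmeController:
    """Stand-in for an NVMe controller; storage is a dict keyed by LBA."""

    def __init__(self) -> None:
        self.blocks = {}

    def execute(self, cmd: Command) -> Completion:
        # Execution depends, at least in part, on the first vendor defined message.
        credits = cmd.vdm.credits if cmd.vdm else 0
        if credits < 1:
            return Completion("rejected", None, VendorDefinedMessage(0))
        data = None
        if cmd.opcode == "write":
            self.blocks[cmd.lba] = cmd.data
        elif cmd.opcode == "read":
            data = self.blocks.get(cmd.lba)
        # Assumed policy: each executed command consumes one credit.
        return Completion("success", data, VendorDefinedMessage(credits - 1))


class NetworkIODevice:
    """Stand-in for the receive, information and forward components."""

    def __init__(self, controller: NvmeController, credits: int) -> None:
        self.controller = controller
        self.credits = credits

    def handle_client_command(self, cmd: Command) -> Tuple[str, Optional[bytes]]:
        cmd.vdm = VendorDefinedMessage(self.credits)  # include first message
        completion = self.controller.execute(cmd)     # forward the command
        self.credits = completion.vdm.credits         # updated flow control
        return completion.status, completion.data     # status for the client
```

In this sketch, a device created with `NetworkIODevice(NvmeController(), credits=2)` would accept two commands and then report a rejected status for a third, because the credit count returned in each second vendor defined message reaches zero. A manageability module as recited in claims 5, 19 and 22 could be modeled as an intermediate object between the two classes, which is omitted here for brevity.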